Hao Li

Jack

T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT

May 01, 2025
Via arXiv

RoboGround: Robotic Manipulation with Grounded Vision-Language Priors

Apr 30, 2025
Via arXiv

TarDiff: Target-Oriented Diffusion Guidance for Synthetic Electronic Health Record Time Series Generation

Apr 24, 2025
Via arXiv

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

Apr 15, 2025
Via arXiv

Single View Garment Reconstruction Using Diffusion Mapping Via Pattern Coordinates

Apr 11, 2025
Via arXiv

GPU-accelerated Evolutionary Many-objective Optimization Using Tensorized NSGA-III

Apr 08, 2025
Via arXiv

OmniCam: Unified Multimodal Video Generation via Camera Control

Apr 03, 2025
Via arXiv

VEGAS: Towards Visually Explainable and Grounded Artificial Social Intelligence

Apr 03, 2025
Via arXiv

Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing

Apr 03, 2025
Via arXiv

CityGS-X: A Scalable Architecture for Efficient and Geometrically Accurate Large-Scale Scene Reconstruction

Mar 29, 2025
Via arXiv